Asymptotic Behaviors of Projected Stochastic Approximation: A Jump Diffusion Perspective
In this paper, we consider linearly constrained stochastic approximation problems, with federated learning (FL) as a special case. We propose a stochastic approximation algorithm named LPSA that ensures feasibility through probabilistic projections: a projection is performed with probability $p_n$ at the $n$-th iteration. For a specific family of probabilities $p_n$ and step sizes $\eta_n$, we analyze the algorithm from an asymptotic, continuous-time perspective. Using a novel jump diffusion approximation, we show that the trajectories consisting of properly rescaled last iterates converge weakly to the solutions of specific SDEs. By analyzing these SDEs, we identify the asymptotic behaviors of LPSA for different choices of $(p_n, \eta_n)$. We find that the algorithm exhibits an intriguing asymptotic bias-variance trade-off depending on the relative magnitude of $p_n$ w.r.t. $\eta_n$.
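To make the probabilistic-projection idea concrete, here is a minimal sketch on a toy quadratic problem with one linear constraint. The specific choices of `eta_n` and `p_n`, and the toy objective, are illustrative assumptions, not the paper's exact setting:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: minimize E[0.5 * ||x - (c + noise)||^2] subject to A x = b.
A = np.array([[1.0, 1.0]])           # linear constraint: x1 + x2 = 1
b = np.array([1.0])
c = np.array([3.0, -1.0])            # unconstrained optimum (in expectation)

# Euclidean projection onto the affine set {x : A x = b}.
AAinv = np.linalg.inv(A @ A.T)
def project(x):
    return x - A.T @ (AAinv @ (A @ x - b))

x = np.zeros(2)
for n in range(1, 20001):
    eta_n = 1.0 / n                   # decreasing step size
    p_n = min(1.0, n ** -0.5)         # projection probability (illustrative)
    grad = x - (c + rng.normal(size=2))   # stochastic gradient of the toy loss
    x = x - eta_n * grad
    if rng.random() < p_n:            # project only with probability p_n
        x = project(x)

# The constrained optimum of the toy problem is the projection of c.
print(project(c), project(x))
```

Because projections become rare as $p_n$ shrinks, the last iterate need not be exactly feasible; this is precisely where the bias-variance trade-off studied in the paper enters.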
Lift What You Can: Green Online Learning with Heterogeneous Ensembles
Köbschall, Kirsten, Buschjäger, Sebastian, Fischer, Raphael, Hartung, Lisa, Kramer, Stefan
Ensemble methods for stream mining require managing multiple models and updating them as data distributions evolve. Despite calls for more sustainability, established methods pay too little attention to ensemble members' computational expenses and focus almost exclusively on predictive capability. To address these challenges and enable green online learning, we propose heterogeneous online ensembles (HEROS). At every training step, HEROS chooses, under resource constraints, a subset of models to train from a pool initialized with diverse hyperparameter choices. We introduce a Markov decision process to theoretically capture the trade-off between predictive performance and sustainability constraints. Based on this framework, we present different policies for choosing which models to train on incoming data. Most notably, we propose the novel $ζ$-policy, which focuses on training near-optimal models at reduced cost. Using a stochastic model, we theoretically prove that the $ζ$-policy achieves near-optimal performance while using fewer resources than the best-performing policy. In experiments across 11 benchmark datasets, we find empirical evidence that the $ζ$-policy is a strong contribution to the state of the art: it is highly accurate, in some cases even outperforming competitors, while being far more resource-friendly.
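A hypothetical sketch of the budgeted selection step, in the spirit of the $ζ$-policy described above: train only models whose estimated loss is within a tolerance of the current best, preferring cheaper ones. The names (`Model`, `zeta_select`) and the cost/loss bookkeeping are illustrative assumptions, not the paper's API:

```python
class Model:
    def __init__(self, name, cost):
        self.name = name      # identifier
        self.cost = cost      # training cost per step (arbitrary units)
        self.loss = 1.0       # running loss estimate

def zeta_select(pool, budget, zeta=0.05):
    """Pick near-optimal models, cheapest first, within the budget."""
    best = min(m.loss for m in pool)
    # candidates whose loss is within zeta of the current best
    near_optimal = [m for m in pool if m.loss <= best + zeta]
    near_optimal.sort(key=lambda m: m.cost)   # prefer cheaper models
    chosen, spent = [], 0.0
    for m in near_optimal:
        if spent + m.cost <= budget:
            chosen.append(m)
            spent += m.cost
    return chosen

pool = [Model("tree", 1.0), Model("nb", 0.2), Model("knn", 2.5)]
pool[0].loss, pool[1].loss, pool[2].loss = 0.30, 0.32, 0.50
print([m.name for m in zeta_select(pool, budget=1.5)])  # -> ['nb', 'tree']
```

The expensive `knn` model is skipped even though it fits the budget alone, because its loss falls outside the near-optimal band.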
Supplementary material

A Experimental details

We use JAX [Bradbury et al., 2018]. All models except those in Section C.4 were trained with a Softmax loss. For Batch Norm we use JAX's Stax implementation, which does not keep track of running statistics. MaxPool((2, 2), 'VALID') performs max pooling with 'VALID' padding. MNIST models were trained on 512 samples of MNIST; CIFAR-10 models were trained without data augmentation. The WRN experiments were run on v3-8 TPUs and the rest on P100 GPUs. Below we describe the particularities of each figure.
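For readers unfamiliar with the padding convention, 'VALID' means no padding is added, so a (2, 2) pool on a 4x4 input yields a 2x2 output. A plain-numpy sketch of what MaxPool((2, 2), 'VALID') computes (the reshape trick is an illustration, not the Stax implementation):

```python
import numpy as np

x = np.arange(16.0).reshape(4, 4)              # 4x4 input feature map
# Split into non-overlapping 2x2 blocks and take the max of each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # -> [[ 5.  7.] [13. 15.]]
```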
Fragment size density estimator for shrinkage-induced fracture based on a physics-informed neural network
This paper presents a neural network (NN)-based solver for an integro-differential equation that models shrinkage-induced fragmentation. The proposed method maps input parameters directly to the corresponding probability density function without numerically solving the governing equation, thereby significantly reducing computational cost. In particular, it enables efficient evaluation of the density function in Monte Carlo simulations while maintaining accuracy comparable to, or even exceeding, that of conventional finite difference schemes. Validation on synthetic data demonstrates both the method's computational efficiency and its predictive reliability. This study establishes a foundation for data-driven inverse analysis of fragmentation and suggests that the framework can be extended beyond pre-specified model structures.
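A minimal sketch of the "parameters in, density out" interface such a surrogate provides, assuming a tiny randomly initialized MLP and a softmax-style normalization over a fragment-size grid (the architecture and names are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = np.linspace(0.0, 1.0, 50)    # fragment-size grid
dx = sizes[1] - sizes[0]

# Tiny MLP: one scalar model parameter -> 50 logits over the grid.
W1 = rng.normal(size=(16, 1)) * 0.5
b1 = np.zeros(16)
W2 = rng.normal(size=(50, 16)) * 0.5
b2 = np.zeros(50)

def density(theta):
    """Map a model parameter directly to a discretized density."""
    h = np.tanh(W1 @ np.array([theta]) + b1)
    logits = W2 @ h + b2
    p = np.exp(logits - logits.max())
    p /= p.sum() * dx                 # normalize so the density integrates to 1
    return p

p = density(0.3)
print(p.sum() * dx)                   # -> 1.0 (up to floating point)
```

Evaluating such a network is a handful of matrix products, which is why repeated density evaluations inside a Monte Carlo loop become cheap compared with re-solving the governing equation each time.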
Asymptotic behavior of eigenvalues of large rank perturbations of large random matrices
Afanasiev, Ievgenii, Berlyand, Leonid, Kiyashko, Mariia
Random Matrix Theory (RMT) is a classical theory that has been developing for more than 70 years. RMT originally arose from problems in nuclear physics and has since found applications in mathematics, physics, finance, and many other disciplines. Recently, new problems have been arising from the area of Machine Learning. Indeed, the weight matrices of Deep Neural Networks (DNNs) are often initialized randomly. Moreover, modern DNNs have large weight matrices, so their spectral properties can be described by the asymptotic behavior of $N \times N$ random matrices as $N \to \infty$.
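A quick empirical illustration of why large random matrices have predictable spectra: the eigenvalues of a variance-normalized symmetric Gaussian (Wigner) matrix concentrate on the interval $[-2, 2]$ as $N$ grows, per the semicircle law. This is a generic demonstration, not the specific perturbation setting studied in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
G = rng.normal(size=(N, N))
W = (G + G.T) / np.sqrt(2 * N)    # symmetrize and normalize so entries have variance 1/N
eigs = np.linalg.eigvalsh(W)      # real spectrum of the symmetric matrix
print(eigs.min(), eigs.max())     # close to -2 and 2
```

Adding a large-rank perturbation to `W` shifts part of this spectrum, which is exactly the regime the paper analyzes.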